How signature connectivity analysis is performed in iLINCS
Depending on the type of the query signature, its connectivity with the libraries of pre‐computed iLINCS signatures is computed using a different connectivity metric.
If the query signature is selected from the iLINCS libraries of pre‐computed signatures, its connectivity with all other iLINCS signatures is pre‐computed as the extreme Pearson’s correlation of the signed significances of all genes, where the signed significance of a gene equals ‐log10(p‐value) multiplied by the sign of its log‐differential expression. If the number of overlapping genes between the significance vectors of two signatures is less than 2,500, the 100 overlapping genes with the most positive and the 100 with the most negative significance values are used for the extreme Pearson’s correlation.
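As an illustration only, the signed significance and an extreme correlation could be sketched as below. This assumes one plausible reading of "extreme": within each signature, every gene outside the n most positive and n most negative signed significances is set to zero before Pearson's correlation is taken over the shared genes. The function names, the dictionary inputs, and the zeroing step are assumptions made for this sketch, not the iLINCS implementation, and the 2,500‐gene overlap condition is not modeled.

```python
import math

def signed_significance(log_de, p_value):
    """-log10(p-value) carrying the sign of the log-differential expression."""
    sign = (log_de > 0) - (log_de < 0)
    return sign * -math.log10(p_value)

def pearson(x, y):
    """Plain Pearson's correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def keep_extremes(values, n_extreme):
    """Zero every value except the n_extreme lowest and n_extreme highest.

    Assumed reading of "extreme"; requires n_extreme >= 1.
    """
    order = sorted(range(len(values)), key=values.__getitem__)
    keep = set(order[:n_extreme]) | set(order[-n_extreme:])
    return [v if i in keep else 0.0 for i, v in enumerate(values)]

def extreme_pearson(sig_a, sig_b, n_extreme=100):
    """sig_a, sig_b: dicts mapping gene -> signed significance."""
    shared = sorted(set(sig_a) & set(sig_b))  # overlapping genes only
    a = keep_extremes([sig_a[g] for g in shared], n_extreme)
    b = keep_extremes([sig_b[g] for g in shared], n_extreme)
    return pearson(a, b)
```

For example, a gene with log‐differential expression of ‐1.5 and a p‐value of 0.001 has a signed significance of ‐3.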
If the query signature is created from an iLINCS dataset, or directly uploaded by the user, the connectivity with all iLINCS signatures is calculated as the weighted correlation between the two vectors of log‐differential expressions, with the vector of weights equal to [‐log10(p‐value of the query) ‐ log10(p‐value of the iLINCS signature)]. When the user‐uploaded signature consists of only log‐differential expression levels without p‐values, the weights are based only on the p‐values of the iLINCS signature [‐log10(p‐value of the iLINCS signature)].
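A minimal sketch of this weighting scheme follows, assuming the two signatures have already been restricted to the same genes in the same order. The function names are hypothetical, and the weighted Pearson formula used here (weighted means, covariance and variances) is the standard textbook definition rather than a confirmed detail of iLINCS.

```python
import math

def connectivity_weights(p_library, p_query=None):
    """Per-gene weights: -log10(p_query) - log10(p_library).

    When the query has no p-values, only the library term is used.
    """
    weights = [-math.log10(p) for p in p_library]
    if p_query is not None:
        weights = [w - math.log10(p) for w, p in zip(weights, p_query)]
    return weights

def weighted_pearson(x, y, w):
    """Weighted Pearson's correlation of x and y with non-negative weights w."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return sxy / math.sqrt(sxx * syy)

# Illustrative (made-up) values for four shared genes.
query_log_de = [1.2, -0.8, 0.3, -1.5]
library_log_de = [0.9, -1.1, -0.2, -1.0]
weights = connectivity_weights(
    p_library=[0.001, 0.01, 0.5, 0.02],
    p_query=[0.002, 0.03, 0.4, 0.01],
)
connectivity = weighted_pearson(query_log_de, library_log_de, weights)
```

Genes that are highly significant in both signatures dominate the result, while genes with p‐values near 1 contribute almost nothing.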
If the query signature uploaded by the user consists of lists of up‐ and down‐regulated genes, connectivity is calculated by assigning ‐1 to down‐regulated and +1 to up‐regulated genes and calculating Pearson’s correlation between this vector and each iLINCS signature. The statistical significance of the correlation in this case is equivalent to that of the t‐test for the difference in the differential expression measures of the iLINCS signature between up‐ and down‐regulated genes.
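The stated equivalence can be checked numerically. The sketch below, with hypothetical function names and made‐up values, codes the gene lists as +1/‐1, converts the resulting correlation r to a t statistic using the standard relation t = r·sqrt((n‐2)/(1‐r²)), and compares it with a pooled‐variance two‐sample t statistic computed directly. It assumes the equal‐variance form of the t‐test, which is the form for which this identity holds.

```python
import math

def pearson(x, y):
    """Plain Pearson's correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def updown_connectivity(up_genes, down_genes, library_signature):
    """Correlate a +1/-1 coded query with a library signature.

    library_signature: dict mapping gene -> differential expression measure.
    Returns (r, t), where t has n - 2 degrees of freedom (undefined if |r| = 1).
    """
    coded = {g: 1.0 for g in up_genes}
    coded.update({g: -1.0 for g in down_genes})
    shared = sorted(g for g in coded if g in library_signature)
    query = [coded[g] for g in shared]
    library = [library_signature[g] for g in shared]
    r = pearson(query, library)
    t = r * math.sqrt((len(shared) - 2) / (1.0 - r * r))
    return r, t

def pooled_t(up_values, down_values):
    """Two-sample t statistic with pooled (equal) variance."""
    n1, n2 = len(up_values), len(down_values)
    m1, m2 = sum(up_values) / n1, sum(down_values) / n2
    ss = sum((v - m1) ** 2 for v in up_values)
    ss += sum((v - m2) ** 2 for v in down_values)
    pooled_var = ss / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Made-up library signature and query lists.
library = {"A": 2.1, "B": 1.4, "C": 0.9, "D": -1.2, "E": -0.3, "F": -2.0}
r, t_from_r = updown_connectivity(["A", "B", "C"], ["D", "E", "F"], library)
t_direct = pooled_t([2.1, 1.4, 0.9], [-1.2, -0.3, -2.0])
```

Here t_from_r and t_direct agree up to floating‐point error, so both tests yield the same p‐value.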
If the query signature is uploaded by the user in the form of a gene list, the connectivity with each iLINCS signature is calculated as the enrichment of highly significant differential expression levels of that signature within the submitted gene list, using the Random Set analysis.
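A rough sketch of a random‐set style enrichment score is given below. It uses the usual random‐set formulation, in which the mean score of the submitted genes is compared with its expectation and variance under random selection of an equally sized gene set without replacement. The choice of per‐gene score (for instance ‐log10(p‐value) in the iLINCS signature), the normal approximation for the p‐value, and the function names are all assumptions of this example rather than documented iLINCS behavior.

```python
import math

def random_set_z(gene_list, scores):
    """Z-score of the mean score of gene_list under random-set sampling.

    scores: dict mapping gene -> per-gene score in one library signature
    (assumed here to be something like -log10(p-value)).
    Requires the list to cover at least one but not all scored genes.
    """
    members = [g for g in set(gene_list) if g in scores]
    n_all, n_set = len(scores), len(members)
    mean_all = sum(scores.values()) / n_all
    var_all = sum((s - mean_all) ** 2 for s in scores.values()) / n_all
    mean_set = sum(scores[g] for g in members) / n_set
    # Variance of a sample mean when drawing n_set genes without replacement.
    var_mean = var_all / n_set * (n_all - n_set) / (n_all - 1)
    return (mean_set - mean_all) / math.sqrt(var_mean)

def upper_tail_p(z):
    """One-sided p-value from the standard normal approximation."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Made-up scores for ten genes; the query list holds the three highest.
scores = dict(zip("ABCDEFGHIJ", [5, 4, 3, 0.5, 0.4, 0.3, 0.2, 0.2, 0.1, 0.1]))
z = random_set_z(["A", "B", "C"], scores)
p = upper_tail_p(z)
```

A large positive z indicates that the submitted genes carry higher scores in the signature than a random gene set of the same size would be expected to.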